Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wallet2: use a gamma distribution to pick fake outs #3528

Merged
merged 1 commit into from
Aug 15, 2018

Conversation

moneromooo-monero
Copy link
Collaborator

as per "An Empirical Analysis of Linkability in the Monero
Blockchain", by Miller et al.

Copy link
Contributor

@vtnerd vtnerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has anyone confirmed the findings of this paper? Is the gamma distribution actually as described? I see no reason to suspect intentionally incorrect findings, and the conclusions seem correct, but I also haven't run the same analysis on the blockchain.

for (size_t idx: selected_transfers)
if (m_transfers[idx].is_rct())
has_rct = true;
bool has_rct_distribution = has_rct && get_rct_distribution(rct_start_height, rct_offsets);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be const, and probably shouldn't be modified after this point.

THROW_WALLET_EXCEPTION_IF(resp_t.status == CORE_RPC_STATUS_BUSY, error::daemon_busy, "get_output_histogram");
THROW_WALLET_EXCEPTION_IF(resp_t.status != CORE_RPC_STATUS_OK, error::get_histogram_error, resp_t.status);
if (!m_transfers[idx].is_rct() || !has_rct_distribution)
req_t.amounts.push_back(m_transfers[idx].is_rct() ? 0 : m_transfers[idx].amount());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ternary operation should be pointless now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because amount() will return 0 for rct ? Yes, that seems right, but there are other places where the amount is known (when you're sending) and I feel a bit uneasy not having this explicit case here if it inspires other place. But I think technically you're right.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that m_amount (which is the same as amount()) for rct outputs is the decoded amount, not zero.

td.m_amount = tx_scan_info[o].amount;

uint64_t block_offset = x / DIFFICULTY_TARGET_V2; // this assumes constant target over the whole rct range
if (block_offset >= rct_offsets.size() - 1)
return std::numeric_limits<uint64_t>::max(); // bad pick
block_offset = rct_offsets.size() - 2 - block_offset;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this select outputs that are still locked? The rct_offsets looks to be the up to the latest block, and I think the num_outs call is filtering away locked outputs below. I think this is correct behavior but I wanted to start some discussion based on the recommendations from the paper.

Specifically my question is whether the gamma distribution should be over just the unlocked outputs, or over all outputs. It looks like the paper just tracked the spend time of 0 and 1 mixin outputs, and noted that the time distribution was different than Bitcoin. Which is to be expected, since Bitcoin outputs do not have a lock period. So again, I think the authors intended the parameters to be used over all outputs, but I was curious whether anyone else had information on this topic.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can pick locked outputs indeed, and they're rejected later. This is necessary also because coinbase outs are locked for 60 blocks, so all locked outs aren't necessarily at the end. In any case, I think using one or the other would yield a very similar distribution anyway (just a feeling, no evidence).

@moneromooo-monero
Copy link
Collaborator Author

One thing I've confirmed is that the (pre-dupe-rejection) distribution gets us roughly half of pick within the last 1.8 days. Those 1.8 days were estimated from the paper's graph though, but it means the code seems to match the claimed distribution, at least.

bool has_rct = false;
for (size_t idx: selected_transfers)
if (m_transfers[idx].is_rct())
has_rct = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't break the cycle after the condition is met? no need to iterate further

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason either way.

double min() const { return 0.0; }
double max() const { return 1.0; }
double operator()() { return crypto::rand<uint64_t>() / (double)std::numeric_limits<uint64_t>::max(); }
} engine;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gives the following compile error:

In file included from /Users/buildbot/monero-builder/monero-static-osx-10_13/build/src/wallet/wallet2.cpp:32:
/Library/Developer/CommandLineTools/usr/include/c++/v1/random:3643:51: error: call to non-static member function without an object argument
    const size_t __logR = __log2<uint64_t, _URNG::max() - _URNG::min() + uint64_t(1)>::value;
                                           ~~~~~~~^~~
/Library/Developer/CommandLineTools/usr/include/c++/v1/random:4167:30: note: in instantiation of function template specialization 'std::__1::generate_canonical<double, 53, gamma_engine>' requested here
                      _VSTD::generate_canonical<result_type,
                             ^
...

because the argument to the distribution sampler needs to satisfy the UniformRandomBitGenerator requirements. The following eliminates the error:

    struct gamma_engine
    {
      typedef uint64_t result_type;
      static constexpr uint64_t min() { return 0; }
      static constexpr uint64_t max() { return std::numeric_limits<uint64_t>::max(); }
      uint64_t operator()() { return crypto::rand<uint64_t>(); }
    } engine;

if (amount == 0 && has_rct_distribution)
{
// gamma distribution
do i = pick_gamma(); while (i >= num_outs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means the pre/post-fork segregation is deactivated for this scheme. Is this intentional?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not, it was coded before that change.

@moneromooo-monero moneromooo-monero force-pushed the gamma branch 2 times, most recently from b722e20 to 24b2313 Compare July 2, 2018 15:23
struct gamma_engine
{
uint64_t min() const { return 0; }
uint64_t max() const { return std::numeric_limits<uint64_t>::max(); }
Copy link
Contributor

@stoffu stoffu Jul 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

min and max need to be static according to the specification. OSX builds are failing: https://build.getmonero.org/builders/monero-static-osx-10.13/builds/762/steps/compile/logs/stdio

constexpr seems also necessary, because I had to add it to make it compile for OSX.

if (amount == 0 && has_rct_distribution)
{
// gamma distribution
if (num_found -1 < recent_outputs_count + pre_fork_outputs_count)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case recent_outputs_count is known to be 0, right? Why bother adding?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it matches the ordering so it's a lot clearer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

@moneromooo-monero
Copy link
Collaborator Author

moneromooo-monero commented Jul 3, 2018

Does it buld for you without typedef uint64_t result_type; ? Adding this here makes CLANG error out with "error: unused typedef 'result_type' [-Werror,-Wunused-local-typedef]". Alternatively, I could replace the uint64_t in the return types with result_type though, if that also works for you.

Edit: I've just pushed a version with that second alternative.

as per "An Empirical Analysis of Linkability in the Monero
Blockchain", by Miller et al.
@stoffu
Copy link
Contributor

stoffu commented Jul 3, 2018

Not adding result_type does work, yes.

@luigi1111 luigi1111 merged commit 34d4b79 into monero-project:master Aug 15, 2018
luigi1111 added a commit that referenced this pull request Aug 15, 2018
34d4b79 wallet2: use a gamma distribution to pick fake outs (moneromooo-monero)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants